NTCIR-3 CLIR Experiments at MSRA
Authors
Abstract
This paper describes three statistical models for resolving query translation ambiguity in cross-language information retrieval (CLIR). First, a decaying co-occurrence model is presented. It extends traditional co-occurrence models with a decaying factor that decreases the mutual information between terms as the distance between them increases. Second, a phrase translation model is described, which aims to detect and translate noun phrases that are not stored in the dictionary. Finally, a triple translation model is proposed, which provides a way of exploiting linguistic dependency information. Experiments on the TREC and NTCIR corpora show improvements from using these models.
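As a rough illustration of the decaying co-occurrence idea, the sketch below scores a term pair by its pointwise mutual information multiplied by a factor that shrinks as the average distance between the two terms grows. The abstract does not give the formula, so the exponential form of the decay, the sentence-level counting, the clamping of negative mutual information to zero, and the names `decaying_cooccurrence`, `sim`, and `alpha` are all assumptions made for this example rather than the authors' definitions.

```python
import math
from collections import Counter
from itertools import combinations


def decaying_cooccurrence(sentences, alpha=0.8):
    """Build sim(x, y) = max(0, PMI(x, y)) * exp(-alpha * (avg_dist - 1)).

    `sentences` is a list of token lists. The exponential decay and the
    sentence-level probabilities are assumptions for illustration only.
    """
    term_count = Counter()   # number of sentences containing each term
    pair_count = Counter()   # number of sentences containing both terms
    pair_dist = Counter()    # summed token distance over those sentences
    n_sent = len(sentences)

    for tokens in sentences:
        positions = {}
        for i, tok in enumerate(tokens):
            positions.setdefault(tok, i)  # keep the first occurrence only
        term_count.update(positions.keys())
        for a, b in combinations(sorted(positions), 2):
            pair_count[(a, b)] += 1
            pair_dist[(a, b)] += abs(positions[a] - positions[b])

    def sim(x, y):
        key = tuple(sorted((x, y)))
        c = pair_count.get(key, 0)
        if x == y or c == 0:
            return 0.0
        p_xy = c / n_sent
        p_x = term_count[x] / n_sent
        p_y = term_count[y] / n_sent
        # Clamp negative values so the decay can only lower the score.
        mi = max(0.0, math.log(p_xy / (p_x * p_y)))
        avg_dist = pair_dist[key] / c
        return mi * math.exp(-alpha * (avg_dist - 1))

    return sim


# Hypothetical toy corpus: "a" is adjacent to "b" but four tokens from "c".
corpus = [["a", "b", "p", "q", "c"], ["d", "e"], ["f", "g"], ["h", "i"]]
score = decaying_cooccurrence(corpus, alpha=0.8)
print(score("a", "b"), score("a", "c"))
```

With `alpha = 0` the decay factor is always 1 and the score reduces to a plain (clamped) co-occurrence measure, which is one way to compare the two variants on the same data.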
Related resources
NTCIR-3 CLIR Experiments at Osaka Kyoiku University - Comparison of Gram-based Indices
Long gram-based indices are evaluated in the NTCIR-3 CLIR task. Building gram-based indices requires no linguistic analysis such as morphological analysis. Indices are built for three languages (Japanese, English and Chinese) in this task. They differ considerably in some respects; for example, the difference in index overhead stems from the difference in character encoding.
Overview of CLIR Task at the Fifth NTCIR Workshop
The purpose of this paper is to overview research efforts at the NTCIR-5 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from over ten countries are ...
Overview of CLIR Task at the Sixth NTCIR Workshop
The purpose of this paper is to overview research efforts at the NTCIR-6 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from ten countries or region...
Overview of CLIR Task at the Third NTCIR Workshop
This report is an overview of the Cross-Language Information Retrieval (CLIR) Task at the Third NTCIR Workshop. There are three tracks in CLIR: Single Language IR (SLIR), Bilingual CLIR (BLIR), and Multilingual CLIR (MLIR). The report describes the scope, schedule, test collections, search results, relevance judgments, scoring results, and preliminary analyses.
Notes on the Limits of CLIR Effectiveness: NTCIR-2 Evaluation Experiments at Justsystem
NTCIR-2 evaluation experiments at the Justsystem site are described with a focus on comparative study of CLIR effectiveness with monolingual retrieval effectiveness of the same retrieval engine. Experiments on the effects of phrasal translation, indexing of translated phrasal terms, pre-translation feedback and parallel documents feedback in diverse retrieval settings, are reported. The results...